Explain multioutput predictive models with dalex

This notebook provides examples of working with multiclass classification and other multioutput algorithms, e.g. multioutput regression.

A natural example of such an algorithm is a multilayer perceptron neural network.

For a broad overview of the topic, see the introduction in the scikit-learn documentation: 1.12. Multiclass and multioutput algorithms.

https://dalex.drwhy.ai/python

Imports

In [1]:
import dalex as dx

import numpy as np
import pandas as pd

from sklearn import datasets
from sklearn.ensemble import RandomForestRegressor
from sklearn.multioutput import MultiOutputRegressor
from lightgbm import LGBMClassifier, LGBMRegressor

import warnings
warnings.filterwarnings('ignore')
In [2]:
dx.__version__
Out[2]:
'1.4.1.9000'

Part 1: treating a multioutput model as multiple single-output models

One approach is to use each of the model's outputs separately, e.g. the predicted probability of a given class in a multiclass classification problem.

Part 1A: Multiclass classification

We will use the iris dataset and the LGBMClassifier model for this example.

In [3]:
# data
X, y = datasets.load_iris(return_X_y=True, as_frame=True)

# model 
model = LGBMClassifier(n_estimators=25)
model.fit(X, y)

# model has 3 outputs
model.predict_proba(X).shape
Out[3]:
(150, 3)

Let's explain the classification for the first class, class 0. To do so, we need to create a custom predict_function.

In [4]:
# custom (binary) predict function
pf_0 = lambda m, d: m.predict_proba(d)[:, 0]

# custom (binary) target values
y_0 = y == 0

# explainer
exp_0 = dx.Explainer(model, X, y_0, predict_function=pf_0, label="LGBMClassifier: class 0")
Preparation of a new explainer is initiated

  -> data              : 150 rows 4 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 150 values
  -> model_class       : lightgbm.sklearn.LGBMClassifier (default)
  -> label             : LGBMClassifier: class 0
  -> predict function  : <function <lambda> at 0x000002496BA75430> will be used
  -> predict function  : Accepts pandas.DataFrame and numpy.ndarray.
  -> predicted values  : min = 0.0309, mean = 0.338, max = 0.939
  -> model type        : classification will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         : min = -0.369, mean = -0.00495, max = 0.199
  -> model_info        : package lightgbm

A new explainer has been created!
In [5]:
exp_0.model_performance()
Out[5]:
recall precision f1 accuracy auc
LGBMClassifier: class 0 1.0 1.0 1.0 1.0 1.0
In [6]:
exp_0.model_parts().plot()
exp_0.model_profile().plot()
Calculating ceteris paribus: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00, 15.52it/s]

Now, let's explain the classification for all the classes. For that, we create one explainer per class in a loop.

In [7]:
exp_list = []

for i in range(len(np.unique(y))):
    # bind i as a default argument so each lambda keeps its own class index
    pf = lambda m, d, i=i: m.predict_proba(d)[:, i]
    e = dx.Explainer(
        model, X, 
        y == i, 
        predict_function=pf,
        label=f'LGBMClassifier: class {i}',
        verbose=False
    )
    exp_list += [e]

exp_list
Out[7]:
[<dalex._explainer.object.Explainer at 0x2496bd66b50>,
 <dalex._explainer.object.Explainer at 0x2496bab7eb0>,
 <dalex._explainer.object.Explainer at 0x2496bcaf1c0>]
In [8]:
# create multiple explanations
m_profile_list = [e.model_profile() for e in exp_list]
Calculating ceteris paribus: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00, 26.69it/s]
Calculating ceteris paribus: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00, 23.29it/s]
Calculating ceteris paribus: 100%|███████████████████████████████████████████████████████| 4/4 [00:00<00:00, 25.11it/s]
In [9]:
# plot multiple explanations
m_profile_list[0].plot(m_profile_list[1:])
In [10]:
m_parts_list = [e.model_parts() for e in exp_list]
m_parts_list[0].plot(m_parts_list[1:])

We explain individual predictions in the same way.

In [11]:
# choose a data point to explain
observation = X.iloc[[0]]

p_parts_list = [e.predict_parts(observation) for e in exp_list]
p_parts_list[0].plot(p_parts_list[1:], min_max=[-0.1, 1.1])
In [12]:
p_profile_list = [e.predict_profile(observation) for e in exp_list]
p_profile_list[0].plot(p_profile_list[1:])
Calculating ceteris paribus: 100%|██████████████████████████████████████████████████████| 4/4 [00:00<00:00, 107.53it/s]
Calculating ceteris paribus: 100%|██████████████████████████████████████████████████████| 4/4 [00:00<00:00, 111.92it/s]
Calculating ceteris paribus: 100%|██████████████████████████████████████████████████████| 4/4 [00:00<00:00, 146.29it/s]

Part 1B: Multioutput regression

We will use a custom dataset and two models for this example:

  1. RandomForestRegressor, which supports multioutput directly,
  2. LGBMRegressor with MultiOutputRegressor, which involves creating one model for each output independently.

For details, see 1.10.3. Multi-output problems and scikit-learn: Comparing random forests and the multi-output meta estimator.
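As an aside, the "one model per output" behaviour of MultiOutputRegressor can be verified directly: after fitting, its estimators_ attribute holds one independent single-output estimator per target column. A minimal sketch on a toy dataset (LinearRegression is used here only for speed; the setup is illustrative, not from this notebook):

```python
# Sketch: MultiOutputRegressor fits one independent estimator per target.
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.multioutput import MultiOutputRegressor

X, y = make_regression(
    n_samples=100, n_features=7, n_informative=5, n_targets=3, random_state=0
)
wrapped = MultiOutputRegressor(LinearRegression()).fit(X, y)

# one fitted single-output estimator per target column
print(len(wrapped.estimators_))
```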

In [13]:
# create a toy multiregression example
n_outputs = 4
X, y = datasets.make_regression(n_samples=2000, n_features=7, n_informative=5, n_targets=n_outputs, effective_rank=1, noise=0.5, random_state=1)

# summarize the dataset
print(X.shape, y.shape)
(2000, 7) (2000, 4)
In [14]:
# model
model_rf = RandomForestRegressor()
model_rf.fit(X, y)
model_rf.predict(X).shape
Out[14]:
(2000, 4)

The following code raises an error because the LGBMRegressor model does not support multioutput directly.

In [15]:
try:
    model_gbm = LGBMRegressor()
    model_gbm.fit(X, y)
except Exception as e:
    print(f"Error: {e}")
Error: y should be a 1d array, got an array of shape (2000, 4) instead.
In [16]:
# wrap the second model
model_gbm = MultiOutputRegressor(LGBMRegressor())
model_gbm.fit(X, y)
model_gbm.predict(X).shape
Out[16]:
(2000, 4)

Like in Part 1A, we explain the regression for all the outputs by looping over multiple explainers.

In [17]:
exp_rf_list, exp_gbm_list = [], []

for i in range(n_outputs):
    # bind i as a default argument so each lambda keeps its own output index
    pf = lambda m, d, i=i: m.predict(d)[:, i]

    e_rf = dx.Explainer(
        model_rf, X, 
        y[:, i], 
        predict_function=pf,
        label=f'RF: output {i}',
        verbose=False
    )
    e_gbm = dx.Explainer(
        model_gbm, X, 
        y[:, i], 
        predict_function=pf,
        label=f'GBM: output {i}',
        verbose=False
    )

    exp_rf_list += [e_rf]
    exp_gbm_list += [e_gbm]

exp_rf_list + exp_gbm_list
Out[17]:
[<dalex._explainer.object.Explainer at 0x24959aa1160>,
 <dalex._explainer.object.Explainer at 0x24959a949d0>,
 <dalex._explainer.object.Explainer at 0x24959aa49d0>,
 <dalex._explainer.object.Explainer at 0x2496cf9df10>,
 <dalex._explainer.object.Explainer at 0x24959aa10a0>,
 <dalex._explainer.object.Explainer at 0x24959aa40d0>,
 <dalex._explainer.object.Explainer at 0x24959aa8d00>,
 <dalex._explainer.object.Explainer at 0x24959aa9c70>]

Let's check the models' performance.

In [18]:
m_performance_list = [e.model_performance() for e in exp_rf_list + exp_gbm_list]
pd.concat([mp.result for mp in m_performance_list], axis=0)
Out[18]:
mse rmse r2 mae mad
RF: output 0 0.067262 0.259349 0.978939 0.199693 0.158184
RF: output 1 0.051843 0.227690 0.979209 0.179389 0.149368
RF: output 2 0.052691 0.229546 0.961419 0.180203 0.145498
RF: output 3 0.064739 0.254438 0.980424 0.198300 0.162613
GBM: output 0 0.083923 0.289694 0.973723 0.226224 0.182196
GBM: output 1 0.081460 0.285411 0.967332 0.225156 0.190575
GBM: output 2 0.084242 0.290245 0.938318 0.230180 0.196593
GBM: output 3 0.080827 0.284301 0.975559 0.224692 0.184345
In [19]:
m_performance_list[0].plot(m_performance_list[4:8])

Explain!

In [20]:
m_parts_gbm_list = [e.model_parts(verbose=False) for e in exp_gbm_list]
m_parts_gbm_list[0].plot(m_parts_gbm_list[1:])

If the targets have related values and domains, we could compare their profiles.

In [21]:
m_profile_gbm_list = [e.model_profile(verbose=False) for e in exp_gbm_list]
m_profile_gbm_list[0].plot(m_profile_gbm_list[1:])

Let's compare local explanations.

In [22]:
# choose a data point to explain
observation = X[0, :]

exp_rf_list[1].predict_parts(observation, type="shap").plot(
    exp_gbm_list[1].predict_parts(observation, type="shap")
)

Part 2: adapting explanations for multioutput models

Another approach is to customize explanations specifically for multioutput models.

For example, there are loss_functions specific to multiclass classification, which we can substitute when calculating feature importance. We will use the iris dataset and the LGBMClassifier model for this example.

In [23]:
# data
X, y = datasets.load_iris(return_X_y=True, as_frame=True)

# model 
model = LGBMClassifier(n_estimators=25)
model.fit(X, y)

# model has 3 outputs
model.predict_proba(X).shape
Out[23]:
(150, 3)
In [24]:
# predict function
pf = lambda m, d: m.predict_proba(d)

# explainer
exp = dx.Explainer(model, X, y, predict_function=pf, label="LGBMClassifier: multioutput")
Preparation of a new explainer is initiated

  -> data              : 150 rows 4 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 150 values
  -> model_class       : lightgbm.sklearn.LGBMClassifier (default)
  -> label             : LGBMClassifier: multioutput
  -> predict function  : <function <lambda> at 0x0000024959BEE670> will be used
  -> predict function  : Accepts pandas.DataFrame and numpy.ndarray.
  -> predicted values  : min = 0.0281, mean = 0.333, max = 0.939
  -> model type        : classification will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         :  'residual_function' returns an Error when executed:
operands could not be broadcast together with shapes (150,) (150,3) 
  -> model_info        : package lightgbm

A new explainer has been created!

Please note the above warning:

-> residuals : 'residual_function' returns an Error when executed: operands could not be broadcast together with shapes (150,) (150,3)

This is because the default residual_function expects predict_function to return a 1d numpy.ndarray. With a predict_function that returns a 2d array, other methods raise errors as well:

In [25]:
try:
    exp.model_performance()
except Exception as e:
    print(f"Error: {e}")

try:
    exp.model_parts()
except Exception as e:
    print(f"Error: {e}")

try:
    exp.model_profile(verbose=False)
except Exception as e:
    print(f"Error: {e}")
Error: operands could not be broadcast together with shapes (150,) (150,3) 
Error: Per-column arrays must each be 1-dimensional
Error: Expected a 1D array, got an array with shape (15150, 3)

Let's define a custom loss_function to use with model_parts; for multiclass classification, cross-entropy loss is a natural choice.

In [26]:
def loss_cross_entropy(y_true, y_pred):  
    ## for loop for code clarity - could be optimized
    probs = [0]*len(y_true)
    for i, val in enumerate(y_true):
        probs[i] = y_pred[i, val]
    return np.sum(-np.log(np.maximum(probs, 0.0000001)))
In [27]:
# test if it works
loss_cross_entropy(exp.y, exp.predict(exp.data))
Out[27]:
22.499995698890928
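The loop in loss_cross_entropy can be vectorized with NumPy fancy indexing; a sketch of an equivalent version (same 1e-7 clipping, checked on toy predictions):

```python
import numpy as np

def loss_cross_entropy_vec(y_true, y_pred):
    # fancy indexing picks the predicted probability of the true
    # class for every row at once
    probs = y_pred[np.arange(len(y_true)), y_true]
    return np.sum(-np.log(np.maximum(probs, 1e-7)))

# quick check on toy predictions
y_true = np.array([0, 1, 2])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.2, 0.3, 0.5]])
print(loss_cross_entropy_vec(y_true, y_pred))
```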

Explain! We obtain one global explanation for the multioutput model.

In [28]:
mp = exp.model_parts(loss_function=loss_cross_entropy)
In [29]:
mp.result
Out[29]:
variable dropout_loss label
0 _full_model_ 22.499996 LGBMClassifier: multioutput
1 sepal length (cm) 22.742525 LGBMClassifier: multioutput
2 sepal width (cm) 23.186070 LGBMClassifier: multioutput
3 petal width (cm) 55.545925 LGBMClassifier: multioutput
4 petal length (cm) 160.798938 LGBMClassifier: multioutput
5 _baseline_ 311.231587 LGBMClassifier: multioutput
In [30]:
mp.plot()

Let's define a custom residual_function to work with model_diagnostics.

In [31]:
# residual function
def residual_multiclass(model, data, y_true):  
    # convert target to one-hot encoding
    y_true_ohe = np.zeros((len(y_true), max(y_true)+1))
    y_true_ohe[np.arange(len(y_true)), y_true] = 1
    
    y_pred = model.predict_proba(data)

    # an example definition of a residual for multiclass classification
    residuals = (y_true_ohe - y_pred).max(axis=1)
    
    return residuals

residual_multiclass(model, X, y)
Out[31]:
array([0.06527148, 0.08003894, 0.08007315, 0.07883426, 0.06311306,
       0.06284237, 0.06381608, 0.06209995, 0.08004891, 0.07883426,
       0.06422609, 0.06309267, 0.08003894, 0.08003894, 0.06527148,
       0.06338463, 0.06443012, 0.06443012, 0.06284237, 0.06338463,
       0.0636695 , 0.06338463, 0.06467162, 0.19946586, 0.18312232,
       0.07691005, 0.06075906, 0.06422609, 0.06527148, 0.07814762,
       0.07814762, 0.06338463, 0.06422609, 0.06527148, 0.07883426,
       0.0781827 , 0.06527148, 0.06467162, 0.08003894, 0.06422609,
       0.06229767, 0.08140449, 0.08007315, 0.16864026, 0.18658398,
       0.08003894, 0.0636695 , 0.08007315, 0.06422609, 0.07391836,
       0.11051387, 0.13293178, 0.32718289, 0.08230225, 0.15679968,
       0.11729504, 0.13293178, 0.09639225, 0.10644051, 0.09244473,
       0.09424732, 0.10052819, 0.09039836, 0.12019645, 0.0744245 ,
       0.07767461, 0.14999645, 0.09091958, 0.15972139, 0.08230225,
       0.66221966, 0.07801798, 0.36363436, 0.1160211 , 0.0761035 ,
       0.08538487, 0.21941339, 0.82251364, 0.14432143, 0.09146706,
       0.08235181, 0.09423469, 0.07935254, 0.60814822, 0.14999645,
       0.13293178, 0.13026772, 0.08441336, 0.07437116, 0.08230225,
       0.08797487, 0.12019645, 0.07895051, 0.09424732, 0.08614334,
       0.07499571, 0.07500534, 0.07607415, 0.39691334, 0.07891189,
       0.06816928, 0.11820603, 0.07156775, 0.07159839, 0.06640973,
       0.07156775, 0.5116304 , 0.07715461, 0.07672906, 0.07355273,
       0.12069888, 0.09788906, 0.0756484 , 0.17300273, 0.11820603,
       0.0999972 , 0.07582626, 0.07355273, 0.07117047, 0.60773795,
       0.07355273, 0.21252498, 0.07156775, 0.23114094, 0.07355273,
       0.07928045, 0.341382  , 0.24553454, 0.06640973, 0.31239242,
       0.07156775, 0.07355273, 0.06640973, 0.46648616, 0.34078938,
       0.07156775, 0.06816928, 0.07582626, 0.35982812, 0.0756484 ,
       0.07156775, 0.1260149 , 0.11820603, 0.07355273, 0.07355273,
       0.10452841, 0.17300273, 0.09788906, 0.0721153 , 0.12694513])
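The manual one-hot encoding used in residual_multiclass can also be written more compactly by indexing rows of an identity matrix; a small standalone sketch showing the two forms agree:

```python
import numpy as np

y_true = np.array([0, 2, 1, 2])

# manual fill, as in residual_multiclass above
manual = np.zeros((len(y_true), y_true.max() + 1))
manual[np.arange(len(y_true)), y_true] = 1

# equivalent one-liner: look up rows of an identity matrix
via_eye = np.eye(y_true.max() + 1)[y_true]

print(np.array_equal(manual, via_eye))
```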
In [32]:
# explainer
exp_res = dx.Explainer(
    model, X, y, 
    # by default, predict_function needs to return a (1d) numpy.ndarray
    predict_function=lambda m, d: m.predict(d), 
    residual_function=residual_multiclass, 
    label="LGBMClassifier: multioutput"
)
Preparation of a new explainer is initiated

  -> data              : 150 rows 4 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 150 values
  -> model_class       : lightgbm.sklearn.LGBMClassifier (default)
  -> label             : LGBMClassifier: multioutput
  -> predict function  : <function <lambda> at 0x0000024959B80E50> will be used
  -> predict function  : Accepts pandas.DataFrame and numpy.ndarray.
  -> predicted values  : min = 0.0, mean = 1.01, max = 2.0
  -> model type        : classification will be used (default)
  -> residual function : <function residual_multiclass at 0x0000024959B80AF0>
  -> residuals         : min = 0.0608, mean = 0.126, max = 0.823
  -> model_info        : package lightgbm

A new explainer has been created!
In [33]:
exp_res.model_diagnostics().plot(variable="petal length (cm)")

Plots

This package uses plotly to render the plots.

Resources - https://dalex.drwhy.ai/python